Smart Paper Notes

Diffusion Autoencoders: Toward a Meaningful and Decodable Representation

Konpat Preechakul, Nattanat Chatthee, Suttisak Wizadwongsa, Supasorn Suwajanakorn

Category: Computer Vision

2021-11-30

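Diffusion probabilistic models (DPMs) have achieved remarkable quality in image generation that rivals that of GANs. Unlike GANs, however, DPMs use a set of latent variables that lack semantic meaning and cannot serve as a useful representation for other tasks. This paper explores the possibility of using DPMs for representation learning and seeks to extract a meaningful and decodable representation of an input image via autoencoding. Our key idea is to use a learnable encoder to discover the high-level semantics, and a DPM as the decoder to model the remaining stochastic variations. Our method can encode any image into a two-part latent code, where the first part is semantically meaningful and linear, and the second part captures stochastic details, allowing near-exact reconstruction. This capability enables challenging applications that currently foil GAN-based methods, such as attribute manipulation on real images. We also show that this two-level encoding improves denoising efficiency and naturally facilitates various downstream tasks, including few-shot conditional sampling.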
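Because the abstract describes the first part of the latent code as linear, attribute manipulation can be pictured as moving that code along a direction in latent space. The following is only a minimal sketch under assumptions, not the authors' implementation: the function name `shift_along_direction`, the plain-list representation of the semantic code, and the idea that `direction` comes from something like a linear attribute classifier are all hypothetical and do not appear in the abstract.

```python
import math


def shift_along_direction(z_sem, direction, strength):
    """Move a semantic code along a unit-normalized attribute direction.

    z_sem     -- semantic part of the latent code (list of floats); hypothetical layout
    direction -- attribute direction of the same length (assumed to be given)
    strength  -- signed step size; larger magnitude means a stronger edit
    """
    if len(z_sem) != len(direction):
        raise ValueError("z_sem and direction must have the same length")
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    # Linear edit: z' = z + strength * direction / ||direction||
    return [z + strength * d / norm for z, d in zip(z_sem, direction)]
```

In this picture, the edited semantic code would be passed to the decoder together with the unchanged stochastic part, so that details unrelated to the attribute are preserved. That pairing is an assumption about how the two parts are used, based only on the abstract's description.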


Figure 1: (a) Each pixel in a NeX multiplane image consists of an alpha transparency value, a base color k_0, and view-dependent reflectance coefficients k_1 ... k_n. A linear combination of these coefficients and basis functions learned by a neural network produces the final color value. (b, c) Our synthesized images, which can be rendered in real time with view-dependent effects such as the reflection on the silver spoon. [Panel labels: Neural basis functions; Reflectance coefficients]
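The linear combination mentioned in the caption can be sketched for a single pixel as follows. This is an illustrative sketch under assumptions: it assumes the base color k_0 is added to the weighted sum, that each basis function evaluates to one scalar for the current viewing direction, and that colors are RGB triples. The function name `pixel_color` and the argument layout are hypothetical.

```python
def pixel_color(k0, coeffs, basis_values):
    """Combine a base color with view-dependent terms for one pixel.

    k0           -- base color as an RGB triple (k_0 in the caption)
    coeffs       -- list of RGB triples, one per coefficient k_1 ... k_n
    basis_values -- list of scalars, one per basis function, already
                    evaluated for the current viewing direction (assumed)
    """
    if len(coeffs) != len(basis_values):
        raise ValueError("need one basis value per reflectance coefficient")
    color = list(k0)
    for k_n, h_n in zip(coeffs, basis_values):
        for channel in range(3):
            # Each term contributes k_n * H_n(v) to the final color.
            color[channel] += k_n[channel] * h_n
    return color
```

With all basis values set to zero the result is just the base color, which matches the reading of k_0 as the view-independent part of the pixel.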
